Developer(s) | Citrusleaf, Inc. |
---|---|
Stable release | 2.0.23 / September 1, 2010 |
Written in | C |
Operating system | Linux |
Type | distributed key/value database system |
License | Enterprise (Perpetual or Subscription based) |
Website | www.citrusleaf.net |
The Citrusleaf database is an ACID-compliant, post-relational NoSQL database produced and marketed by Citrusleaf, Inc. It was originally developed to manage mission-critical data for applications on the real-time web. These applications must store 5 to 10 kilobytes of information on each of hundreds of millions of web users and compare it against candidate ads with sub-millisecond response time. Citrusleaf takes advantage of the properties of solid-state drives (SSDs) to accomplish this. As of 2010, Citrusleaf has been deployed in production.
While at Yahoo! and Aggregate Knowledge, the founders of Citrusleaf encountered a problem: the volume and performance demands of real-time web applications caused traditional SQL databases to fail. There were several reasons. The first was the sheer volume of data: keeping track of 5 to 10 kilobytes of information for each of hundreds of millions of people produced a database with billions of objects. Retrieving and processing this information with sub-millisecond response time was impossible with traditional database approaches, which were designed with rotational disk storage in mind. The average seek time of a rotating disk is about ten milliseconds, so a single seek already exceeds a sub-millisecond response budget.
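The arithmetic behind this argument can be checked with a rough calculation. The sketch below assumes 300 million users as a stand-in for "hundreds of millions" and uses decimal kilobytes; both are illustrative assumptions, not figures published by Citrusleaf.

```python
# Back-of-envelope check of the figures quoted above.
# ASSUMPTION: 300 million users stands in for "hundreds of millions".
USERS = 300_000_000
BYTES_PER_USER_LOW = 5 * 1000     # 5 KB (decimal kilobytes assumed)
BYTES_PER_USER_HIGH = 10 * 1000   # 10 KB
SEEK_TIME_S = 0.010               # about 10 ms average seek on a rotating disk
TARGET_LATENCY_S = 0.001          # upper edge of a sub-millisecond budget


def total_terabytes(users: int, bytes_per_user: int) -> float:
    """Total data volume in decimal terabytes."""
    return users * bytes_per_user / 1e12


def max_random_reads_per_second(seek_time_s: float) -> float:
    """Upper bound on random reads per second for one disk,
    if every read pays one full seek."""
    return 1.0 / seek_time_s


low_tb = total_terabytes(USERS, BYTES_PER_USER_LOW)
high_tb = total_terabytes(USERS, BYTES_PER_USER_HIGH)
reads_per_disk = max_random_reads_per_second(SEEK_TIME_S)
seek_over_budget = SEEK_TIME_S / TARGET_LATENCY_S

print(f"data volume: {low_tb:.1f} to {high_tb:.1f} TB")
print(f"random reads per disk per second: at most {reads_per_disk:.0f}")
print(f"one seek alone is {seek_over_budget:.0f}x the latency budget")
```

Under these assumptions the data set runs to a few terabytes, and a single disk seek consumes roughly ten times the whole response budget before any data is transferred.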
The answer lay in making use of solid-state drives (SSDs). Fault tolerance was a second concern: the applications were mission-critical, so in addition to meeting the performance requirements the solution had to be available without interruption. In 2008 Brian Bulkowski created a key-value data store, and Srini Srinivasan joined him in 2009. Together they created the Citrusleaf database platform, an ACID-compliant, scalable, fault-tolerant database engine. The system is capable of 100,000 transactions per second per node, with a response time of under one millisecond. To sustain these transaction loads without interruption as nodes join and leave the cluster, the authors developed software in the areas of distributed systems, real-time prioritization, and storage management across all kinds of storage.
Citrusleaf organizes all data into namespaces. A namespace is similar to a database instance in an RDBMS and controls policies such as replication count and storage location. Within a namespace, each data object is referenced by a table and a primary key, which can be a string, an integer, or binary data. A key is a unique reference to a piece of data; common keys include usernames and session identifiers.
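The addressing scheme described above can be pictured with a small in-memory model. This is only an illustrative sketch: the `Namespace` class, its field names, and the sample values are hypothetical and do not reflect Citrusleaf's client API.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple, Union

# Keys may be strings, integers, or binary data, as described above.
Key = Union[str, int, bytes]


@dataclass
class Namespace:
    """Hypothetical stand-in for a namespace and its policies."""
    name: str
    replication_count: int   # policy: how many copies to keep
    storage_location: str    # policy: illustrative values such as "ssd"
    # Objects are addressed by (table, primary key) within the namespace.
    objects: Dict[Tuple[str, Key], dict] = field(default_factory=dict)

    def put(self, table: str, key: Key, value: dict) -> None:
        self.objects[(table, key)] = value

    def get(self, table: str, key: Key) -> dict:
        return self.objects[(table, key)]


users = Namespace(name="users", replication_count=2, storage_location="ssd")
users.put("profiles", "alice", {"segment": "sports"})      # string key (a username)
users.put("sessions", 12345, {"last_ad": "shoes"})         # integer key
users.put("sessions", b"\x00\x01", {"last_ad": "books"})   # binary key

print(users.get("profiles", "alice"))
```

Policies live on the namespace rather than on individual objects, which mirrors the comparison with a database instance in an RDBMS.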
Each data object is a collection of "bins", in Citrusleaf's parlance, which are similar to columns in SQL. The system is schema-less in that different data objects in the same table can use different bins. Each bin's value is typed. The supported types are strings, integers, blobs, and "reflection blobs", which are binary data produced by a language's own object serializer (such as a Java blob generated by Java's serializer). Typed values let different languages interoperate simply: a string set in Java will appear correctly through the Python client, even though the two languages use different underlying character representations (UTF-16 in Java, typically UTF-8 byte strings in Python).
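One way to picture typed bins is as values stored with a language-neutral type tag. The sketch below is an assumption-laden illustration: the `BinType` names, the UTF-8 string encoding, and the 8-byte integer width are choices made here for the example, not Citrusleaf's documented wire format.

```python
from enum import Enum
from typing import Dict, Tuple


class BinType(Enum):
    STRING = "string"
    INTEGER = "integer"
    BLOB = "blob"
    REFLECTION_BLOB = "reflection_blob"  # language-specific serialized object


# A stored bin value: (type tag, raw bytes).
# ASSUMPTION: strings travel as UTF-8, integers as 8-byte big-endian.
Stored = Tuple[BinType, bytes]


def encode(value) -> Stored:
    """Tag a Python value with a language-neutral type."""
    if isinstance(value, bool):
        raise TypeError("booleans are not one of the listed bin types")
    if isinstance(value, str):
        return (BinType.STRING, value.encode("utf-8"))
    if isinstance(value, int):
        return (BinType.INTEGER, value.to_bytes(8, "big", signed=True))
    if isinstance(value, (bytes, bytearray)):
        return (BinType.BLOB, bytes(value))
    raise TypeError(f"unsupported bin value: {type(value).__name__}")


def decode(stored: Stored):
    """Recover a native Python value from a tagged bin."""
    bin_type, raw = stored
    if bin_type is BinType.STRING:
        return raw.decode("utf-8")
    if bin_type is BinType.INTEGER:
        return int.from_bytes(raw, "big", signed=True)
    return raw  # blobs and reflection blobs stay opaque bytes


# Schema-less: two records in the same table carry different bins.
record_a: Dict[str, Stored] = {"name": encode("Zoë"), "visits": encode(7)}
record_b: Dict[str, Stored] = {"avatar": encode(b"\x89PNG")}

print({name: decode(v) for name, v in record_a.items()})
print(sorted(record_b))
```

Because the tag travels with the bytes, a client in another language can decode a string into its own native string type without knowing how the writer represented it internally.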
Some high-level operations (such as atomically adding to an integer) are supported, in the style of Redis, but the set of such operations is limited.
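An atomic add is comparable in spirit to Redis's `INCRBY` command: the read, the addition, and the write happen as one indivisible step, so concurrent callers never lose updates. The sketch below imitates that behavior with a lock in a hypothetical in-memory `CounterStore`; it is not Citrusleaf's implementation or API.

```python
import threading
from typing import Dict


class CounterStore:
    """Hypothetical in-memory store with an atomic integer add."""

    def __init__(self) -> None:
        self._bins: Dict[str, int] = {}
        self._lock = threading.Lock()

    def add(self, bin_name: str, delta: int) -> int:
        """Atomically add delta to an integer bin and return the new value."""
        with self._lock:
            new_value = self._bins.get(bin_name, 0) + delta
            self._bins[bin_name] = new_value
            return new_value

    def get(self, bin_name: str) -> int:
        with self._lock:
            return self._bins.get(bin_name, 0)


store = CounterStore()


def worker() -> None:
    # Each worker records 1000 impressions, one at a time.
    for _ in range(1000):
        store.add("impressions", 1)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(store.get("impressions"))
```

With eight workers each adding 1000 times, the final count equals 8 × 1000 because every read-modify-write runs under the lock. A client that fetched the value, incremented it locally, and wrote it back could instead overwrite another client's update.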
Citrusleaf's data model allows it to be considered a document store, although it more closely resembles a schema-less version of the row-based schema typically used in relational systems.